Sanskrit Sandhi Splitting using seq2(seq)^2
In Sanskrit, small words (morphemes) are combined to form compound words
through a process known as Sandhi. Sandhi splitting is the process of splitting
a given compound word into its constituent morphemes. Although rules governing
word splitting exist in the language, it is highly challenging to identify the
location of the splits in a compound word. Though existing Sandhi splitting
systems incorporate these pre-defined splitting rules, they have low accuracy
because the same compound word might be broken down in multiple ways that all
yield syntactically correct splits.
In this research, we propose a novel deep learning architecture called Double
Decoder RNN (DD-RNN), which (i) predicts the location of the split(s) with 95%
accuracy, and (ii) predicts the constituent words (learning the Sandhi
splitting rules) with 79.5% accuracy, outperforming the state of the art by 20%.
Additionally, we show the generalization capability of our deep learning model
by reporting competitive results on the problem of Chinese word segmentation.
Comment: Accepted in EMNLP 201
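The ambiguity described in this abstract can be illustrated with a toy sketch. It is not the DD-RNN model: it applies a single, assumed rule table (a long "a" at a junction may come from merging a short or long "a" with a short or long "a") and enumerates every split that rule permits. The function name `candidate_splits` and the example word are illustrative choices.

```python
# Toy illustration: a single vowel-sandhi rule, not the DD-RNN model itself.
A_LONG = "\u0101"  # the long vowel "a with macron"

# Assumed rule table: a long "a" at a junction may result from merging
# a short or long "a" ending with a short or long "a" beginning.
MERGE_SOURCES = {
    A_LONG: [("a", "a"), ("a", A_LONG), (A_LONG, "a"), (A_LONG, A_LONG)],
}


def candidate_splits(compound):
    """Return every (left, right) pair this one rule could have merged."""
    candidates = []
    # Only interior positions can join two non-empty words.
    for i in range(1, len(compound) - 1):
        for left_end, right_start in MERGE_SOURCES.get(compound[i], []):
            left = compound[:i] + left_end
            right = right_start + compound[i + 1:]
            candidates.append((left, right))
    return candidates


word = "vidy" + A_LONG + "laya"  # illustrative compound
splits = candidate_splits(word)
```

Even this one rule produces four rule-consistent candidates for a single junction, which is why a rule table alone cannot select the intended split and some learned or lexical scoring is needed.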
Hi, how can I help you?: Automating enterprise IT support help desks
Question answering is one of the primary challenges of natural language
understanding. In realizing such a system, providing complex long answers to
questions is a challenging task as opposed to factoid answering as the former
needs context disambiguation. The different methods explored in the literature
can be broadly classified into three categories: 1) classification
based, 2) knowledge graph based, and 3) retrieval based. Individually, none of
them addresses the needs of an enterprise-wide assistance system for the IT
support and maintenance domain. In this domain the variance of answers is large,
ranging from factoid answers to structured operating procedures, and the
knowledge is spread across heterogeneous data sources such as application-specific
documentation and ticket management systems. No single technique for
general-purpose assistance scales to such a landscape. To address this, we have
built a cognitive platform with capabilities adapted for this domain. Further,
we have built a general purpose question answering system leveraging the
platform that can be instantiated for multiple products and technologies in the
support domain. The system uses a novel hybrid answering model that
orchestrates across a deep learning classifier, a knowledge graph based context
disambiguation module and a sophisticated bag-of-words search system. This
orchestration performs context switching for a provided question and also does
a smooth hand-off of the question to a human expert if none of the automated
techniques can provide a confident answer. This system has been deployed across
675 internal enterprise IT support and maintenance projects.
Comment: To appear in IAAI 201
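The orchestration with human hand-off described in this abstract might be sketched as follows. This is a minimal sketch under stated assumptions, not the deployed system: the three answerer functions are hypothetical stand-ins for the classifier, knowledge graph, and search components, the canned answers and confidence values are invented for illustration, and the 0.6 threshold is arbitrary.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Answer:
    text: str
    confidence: float  # assumed to lie in [0, 1]
    source: str


# Hypothetical stand-ins for the three automated components.
def classifier_answer(question: str) -> Optional[Answer]:
    if "password" in question.lower():
        return Answer("Use the self-service reset portal.", 0.92, "classifier")
    return None


def knowledge_graph_answer(question: str) -> Optional[Answer]:
    if "vpn" in question.lower():
        return Answer("Which VPN client are you using?", 0.55, "knowledge_graph")
    return None


def search_answer(question: str) -> Optional[Answer]:
    # A bag-of-words search always returns something, with low confidence here.
    return Answer("Closest matching document.", 0.30, "search")


ANSWERERS: List[Callable[[str], Optional[Answer]]] = [
    classifier_answer,
    knowledge_graph_answer,
    search_answer,
]


def orchestrate(question: str, answerers=ANSWERERS, threshold: float = 0.6) -> Answer:
    """Pick the most confident automated answer, or hand off to a human."""
    best: Optional[Answer] = None
    for answerer in answerers:
        candidate = answerer(question)
        if candidate is not None and (best is None or candidate.confidence > best.confidence):
            best = candidate
    if best is None or best.confidence < threshold:
        return Answer("Routing your question to a human expert.", 0.0, "human")
    return best


reset = orchestrate("How do I reset my password?")
vpn = orchestrate("My VPN keeps dropping")
```

With these invented values, the password question is answered by the classifier, while the VPN question falls below the threshold and is handed to a human.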
Tool for Automated Tax Coding of Invoices
Accounts payable refers to the practice in which organizations procure goods and services on credit and must pay the vendors in due time. Once the vendor raises an invoice, it goes through a complex process before the final payment. In this process, tax code determination is one of the most challenging steps: it determines the tax to be levied and directly influences the amount payable to a vendor. This step is also very important from a regulatory compliance standpoint. However, because it is done manually, it is error-prone, labor-intensive, and requires regular training of staff. Further, an error in tax code determination incurs penalties for the organization. Automatically arriving at a tax code for a given product accurately and efficiently is a daunting task. To address this problem, we present an automated end-to-end system for tax code determination that can either be used as a standalone application or be integrated into an existing invoice processing workflow. The proposed system determines the most relevant tax code for an invoice using attributes such as item description, vendor details, and shipping and delivery location. The system has been deployed in production for a multinational consumer goods company for more than 6 months. It has already processed more than 22k items with an accuracy of more than 94% and a high-confidence prediction accuracy of around 99.54%. Using this system, approximately 73% of all invoices require no human intervention.
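One way to relate item-level confidence to the share of invoices needing no human intervention is sketched below. The abstract does not state how the system decides this, so the rule here is an assumption: an invoice counts as touchless only if every item prediction clears a confidence threshold. The invoice IDs, tax codes, confidences, and the 0.9 threshold are made-up illustration data.

```python
from typing import Dict, List, Tuple

Prediction = Tuple[str, float]  # (tax_code, confidence in [0, 1])


def is_touchless(item_predictions: List[Prediction], threshold: float = 0.9) -> bool:
    """Assumed rule: every item on the invoice must be a high-confidence prediction."""
    return bool(item_predictions) and all(
        confidence >= threshold for _, confidence in item_predictions
    )


def touchless_rate(invoices: Dict[str, List[Prediction]], threshold: float = 0.9) -> float:
    """Fraction of invoices that need no manual review under the assumed rule."""
    if not invoices:
        return 0.0
    touchless = sum(is_touchless(items, threshold) for items in invoices.values())
    return touchless / len(invoices)


# Made-up predictions for four invoices.
invoices = {
    "INV-1": [("T1", 0.97), ("T2", 0.95)],
    "INV-2": [("T1", 0.99), ("T3", 0.62)],  # one low-confidence item
    "INV-3": [("T2", 0.91)],
    "INV-4": [("T3", 0.93)],
}
rate = touchless_rate(invoices)
```

Under this rule a single low-confidence item sends the whole invoice to manual review, so the touchless rate depends on both the threshold and the number of items per invoice.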
Democratization of Deep Learning Using DARVIZ
With an abundance of research papers in deep learning, adoption and reproducibility of existing works become a challenge. To make a DL developer's life easy, we propose a novel system, DARVIZ, to visually design a DL model using a drag-and-drop framework in a platform-agnostic manner. The code can be automatically generated in both Caffe and Keras. DARVIZ can import (i) any existing Caffe code, or (ii) a research paper containing a DL design; extract the design; and present it in a visual editor.